Recurrent Neural Network Regularization

Authors

  • Wojciech Zaremba
  • Ilya Sutskever
  • Oriol Vinyals
Abstract

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
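
The abstract does not spell out how dropout is applied, so the following is only a minimal sketch under a stated assumption: dropout is applied to the non-recurrent connection (the input arriving from below) while the recurrent hidden state is passed through untouched. The names `dropout`, `lstm_step`, and `init_params`, the gate ordering, and the weight initialization range are hypothetical choices made for illustration and are not taken from this page.

```python
import math
import random


def dropout(vec, rate, rng, train=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale survivors."""
    if not train or rate == 0.0:
        return list(vec)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vec]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(n_in, n_hidden, rng):
    """Hypothetical helper: small random weights for the four stacked gates."""
    W_x = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(4 * n_hidden)]
    W_h = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(4 * n_hidden)]
    b = [0.0] * (4 * n_hidden)
    return W_x, W_h, b


def lstm_step(x, h_prev, c_prev, W_x, W_h, b, rate, rng, train=True):
    """One LSTM step. Assumption: only the non-recurrent input x is dropped."""
    n = len(h_prev)
    x = dropout(x, rate, rng, train)   # non-recurrent connection: dropout applied
    zx = matvec(W_x, x)
    zh = matvec(W_h, h_prev)           # recurrent connection: no dropout
    z = [a + r + bias for a, r, bias in zip(zx, zh, b)]
    i = [sigmoid(v) for v in z[0:n]]            # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]    # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]  # candidate cell update
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Usage: run a short toy sequence in training mode.
rng = random.Random(0)
W_x, W_h, b = init_params(n_in=3, n_hidden=2, rng=rng)
h, c = [0.0, 0.0], [0.0, 0.0]
for x_t in ([0.5, -0.2, 0.1], [0.0, 0.3, -0.4]):
    h, c = lstm_step(x_t, h, c, W_x, W_h, b, rate=0.5, rng=rng)
```

Because the mask never touches `h_prev` or `c_prev`, the memory carried across time steps is not corrupted by noise at every step; passing `train=False` disables the mask at evaluation time.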

Similar resources

A constrained regularization approach for input-driven recurrent neural networks

We introduce a novel regularization approach for a class of input-driven recurrent neural networks. The regularization of network parameters is constrained to reimplement a previously recorded state trajectory. We derive a closed-form solution for network regularization and show that the method is capable of reimplementing harvested dynamics. We investigate important properties of the method and...

Noisin: Unbiased Regularization for Recurrent Neural Networks

Recurrent neural networks (RNNs) are powerful models of sequential data. They have been successfully used in domains such as text and speech. However, RNNs are susceptible to overfitting; regularization is important. In this paper we develop Noisin, a new method for regularizing RNNs. Noisin injects random noise into the hidden states of the RNN and then maximizes the corresponding marginal lik...

Tikhonov Regularization for Long Short-Term Memory Networks

It is well known that adding noise to the input data often improves network performance. While dropout may cause memory loss when it is applied to recurrent connections, Tikhonov regularization, which can be regarded as training with additive noise, avoids this issue naturally, though it requires deriving the regularizer separately for different architectures. In case of fee...

A Recurrent Neural Network Model for solving CCR Model in Data Envelopment Analysis

In this paper, we present a recurrent neural network model for solving the CCR model in Data Envelopment Analysis (DEA). The proposed neural network model is derived from an unconstrained minimization problem. On the theoretical side, it is shown that the proposed neural network is stable in the sense of Lyapunov and globally convergent to the optimal solution of the CCR model. The proposed model has...

A Recurrent Neural Network to Identify Efficient Decision Making Units in Data Envelopment Analysis

In this paper we present a recurrent neural network model to recognize efficient Decision Making Units (DMUs) in Data Envelopment Analysis (DEA). The proposed neural network model is derived from an unconstrained minimization problem. On the theoretical side, it is shown that the proposed neural network is stable in the sense of Lyapunov and globally convergent. The proposed model has a single-laye...

Journal title:
  • CoRR

Volume abs/1409.2329

Publication date: 2014